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Abstract 

The Central Limit Theorem (CLT) for extrinsic and intrinsic means on 
manifolds is extended to a generalization of Frechet means. Examples are 
the Procrustes mean for 3D Kendall shapes as well as a mean introduced 
by Ziezold. This allows for one-sample tests previously not possible, and 
to numerically assess the 'inconsistency of the Procrustes mean' for a per- 
turbation model and 'inconsistency' within a model recently proposed for 
diffusion tensor imaging. Also it is shown that the CLT can be extended 
to mildly rank deficient diffusion tensors. An application to forestry gives 
the temporal evolution of Douglas fir tree stems tending strongly towards 
cylinders at early ages and tending away with increased competition. 
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1 Introduction 

The study of descriptors of geometrical objects encompassed by terms of form 
or shape be it for artisanry, biological and morphological interest or medical ap- 
plications dates back to the very origins of mankind. It took, and it still takes, 
however, until modern days to fully realize, that underlying most seemingly in- 
tuitive concepts of shape are rather counter-intuitive non-Euclidean geometries. 
Adding to counter-intuition, going from 2D to 3D shapes, these geometries cease 
to be well behaved. For this reason, the statistical analysis of 3D shapes is highly 
challenging and has gained much less attention in theory and practice. 
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This work has been motivated by a joint research with the Institute for 
Forest Biometry and Informatics at the University of Gottingen studying the 
temporal evolution of the 3D shape of tree boles over their growing periods. 
The task tackled here is to assess their growth towards and away from cylinders 
over time. This research is not only of interest in und erstanding fundamentals o f 
biological growth a n d sub sequent model building, cf. ISkatter and Hoibol(ll998l) : 
Koizumi and Hirai ( 20061) . Also, the deviation from cylindricity has a direct 
economical impact by reducing log volume and increa sing the numbe r of tu rns 
to reposition logs for commercial sawing processes, cf. iRappold et al. ( 2007f ). 

While Procru stes analysis is a well establ ished tool for the st atistical analy- 
sis of shape (e.g. Drvden and Mardia ( 1998 ) and Drvden ( 2009f ) for numerical 
routines), to the knowledge of the author, for 3D there are no asymptotic results 
available. Rather there is the belief that Procrustes sample means are unquali- 
fied for inference even in 2D because they may be "inconsistent estimators" of 
the "shape of the mean" - the latter being the mean of a p e rturbation model - 
unless the error is iso tropic ally distributed, see iLeld (|1993l ). Kent and Mardia 
1997 ) as well as Q ( 1998t h This perturbation model introduced by Goodall 



19911 ) and discussed in detail in Section[4](cf. dU on pageQl}, assumes error on 



the original configurations. The results cited above allowed inference based on 
Procrustes means only for a limited set of 2D scenarios, e.g. when randomness 
is caused by a procedure of acquiring landmarks independent of rotation and 
location. For more general settings as often occur in applications of biology, e.g. 
when randomness is also the result of non-uniform biological growth, however, 
Procrustes means could only contribute to data description while they were not 
considered for use in inferential statistics. 

Such mo re general applications in mind, iBhattacharva and Patrangenaru 
( 2003LI2005I) established an asymptotic theory for intrinsic and extrinsic means 
on manifolds allowing for non-parametric asymptotic inference. Here "non- 
parametric" stands for the general approach not silently relying on a mean 
given by a perturbation model in the configuration space. Curiously in 2D, 
Procrustes means coincide with extrinsic means, the latter being "con sistent" 
in the stati stical sense i.e. s atisfying a Strong Law of Large Number s, see lZiezold 
( 19771 ) .111 (119981) as well as Bhattach arva and Patrangenaru! (120031) . '. Due to the 
dominating paradigm of the perturbation model, it seems that the notion of a 
Procrustes population mean has not quite found its way into the community. 

Recently, a perturbation model has also motivated a non-standard approach 
in the context of neuro-imaging by lDryden et al. ( 2009b ). employing Procrustes 
analysis for estimating a mean diffusion tensor. Although "perturbation con- 
sistency" has not been established, practical imaging results in particular for 
nearly degenerate tensors have been of very good quality. 

It is the intention of this work to 

(a) make 3D Procrustes means with their well behaved asymptotics available 
for the practioners in the field, 

(b) to provide for an assessement of "inconsistency" with pertur bation models 
on Kendall's shape spaces as well as for the approach of iDrvden et al 
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(|2009al ) for diffusion tensor imaging, and 



(c) to derive a framework to obtain the temporal shape motion relative to the 
shape of cylinders for a sample of Douglas fir tree stems. 

To this end, the following Section [2] reviews Procrustes analysis and related 



results. Section [3] places Procrustes means and a mean introduced by IZiezold 



(|1994|) in the context of Frechet means and gives the Central-Limit-Theorem 



(CLT) as the main theoretical result. Here, two aspects require special atten- 
tion: first, a CLT can only hold on manifolds while in 3D, the underlying shape 
spaces cease to be that; secondly, sin ce 3D Procrustes means are neither intr in- 
sic nor extrinsic means, the CLT of Bhattac harva and Patrangenarul ( 2005 ) is 



accordingly modified as detailed in the appendix. 

In Section 0] it is clarified that the reported "(in) consistency" of Procrustes 
means expresses "(in)compatibility" of a perturbation model with the canonical 
shape space geometry. Moreover for 3D, simulations based on the CLT show 
that for isotropic errors the perturbation model can be considered nearly com- 
patible in most practical situations unless the shape of the perturbation mean is 
almost degenerate. In contrast by simulation in Section[Sl a similar perturbation 
model for mean diffusion tensors is not compatible, not even for small isotropic 
errors. Curiously, however, with increased degeneracy of the perturbation mean 
considered, incompatibility seems not increasing. This is also the case for an 
extension of the model to mildly rank deficient diffusion tensors, presented here. 

Section [6] introduces a framework for the assessment of shape of frusta cut 
from tree boles. It turns out that the shapes of cylinders form a geodesic 
in the shape space allowing for the computation of the distance of arbitrary 
frusta to the space of cylinders. For statistical inference, Boostrap confidence 
intervals for this distance are simulated from a sample of reconstructed tree 
ring structures of small size. Comparison with two typical growth scenarios 
gives that the bole shape of young trees with little competition grows uniformly 
or stronger towards cylinders, while the motion of shapes of old tree boles with 
heavy competition away from cylinders is similar to a growth maintaining cross- 
sectional ellipticality and tapering. 



2 Procrustes Analysis and Kendall's Shape Spaces 

In the statistical analysis of similarity shapes based on landmark configurations, 
geometrical m-dimensional objects (usually m = 2, 3) are studied by placing 
k > m landmarks at specific locations of each object. Each object is then 
described by a matrix in the space M(m, k) of m x k matrices, each of the k 
columns denoting an m-dimensional landmark vector, (x, y) := tv{xy T ) denotes 
the usual inner product with norm ||x|| = \J (x, x) . For convenience and without 
loss of generality for the considerations below, only centered configurations are 
considered. Centering in way treating all landmarks equally can be achieved by 
multiplying with a sub-Helmert matrix H 
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from the right, yielding x"H in M(m, k — 1). This corresponds to the relocating 
of objects in space such that their mean landmark is zero and isometrically 
projecting the linear sub space of matrices with vanishing mean landmark to 
the space of matrices with k — 1 of landmarks. For this method and other 
centering methods cf. iDrvden and Mardial (|l998l Chapter 2). Exclud ing also 
all matrices with all landmarks coinciding gives the space of configurations 

Ft := M(m,k -1)\{0}. 

Since only the similarity shape is of concern, all configurations are consid- 
ered modulo the group of similarity transformations H — SO(m) x M. + (recall 
that we excluded w.l.o.g. translations) where the action is given by the oper- 
ation (g, X)x — gXx. Here, SO(m) denotes the special orthogonal group (the 
orientation preserving orthogonal transformations). The shape of a configura- 
tion x £ F^ is the orbit [x] = {hx : h £ H}, the shape space is the quotient 
F^/H = {[x] : x € F^} with a suitable metric. 



A naive shape space. A straightforward metric structure for F^/H is given 
by the canonical quotient metric: 

inf \\h\Xi — hiXiW for x\,x% € . 

h 1 ,h 2 £H 

Unfortunately, due to the scaling action of R+, this metric is identically zero, a 
dead end for statistical ambition. 



General Procrustes analysis (GPA). As a workaround. iGowerl (|1975l ) in- 
troduced a constraining condition which allowed for the definition of a mean. 
For a sample of configurations Xi,...,x n € F^ a full Procrustes sample mean 
is given by the shape of 



1 

n 3 



3=1 

where h\, . . . , h* n are minimizers over hi, . . . , h n £ H of the Procrustes sum of 
squares 



\\hjXj — hiXi\\ 2 under the constraining condition 

»>j=i 
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Letting fi — i X^j=i ^j x ji fyi = (Si> ^ ne Procrustes sum of squares is 

n n n 

^ \\hjXj - h lXl \\ 2 = 2nJ2\\h J x j -^|| 2 = 2rc^] A "^l| 2 - 

ij'=l 3=1 3=1 

W.l.o.g. we may assume that all configurations are contained in the unit sphere 

S* := {x e M(m, k - 1) : \\x\\ = 1} . 

Then, any minimizing /i is the orthogonal projection of the mean of minimiz- 
ing hiXi, . . . , h n x n to S*j. In consequence, minimization can be performed 
sequentially, first for the Aj , then for the gj and finally for \x, as noted. Partial 
differentiation in particular gives the minimizing A* = (gjXj,fi) > (unless 
(gXj,fj.) = for all g € SO(m)) such that every fi* having the shape of a full 
Procrustes sample mean is a minimizer of 



J2 I' (9jXj,fj) 9jXj - /i|| 2 = min £ . 



mm mm 
gi,...,g„eSC 
(giXi,y) > o,i = l, . 

with the residual distance 



S(x,y) := min || (ffx, y)gx - j/|| . (2) 
9 e SO(m) 



Kendall's shape spaces. iKendalll (|19770 proposed a slightly different work- 
around. Instead of considering the quotient w.r.t. the action of R + , he projected 
all configurations to called the pre-shape sphere and considered the metric 
quotient w.r.t. SO(m) only, which is called Kendall's shape space 

:= S? n /SO(m) = {[x] : x G S k m } with the fiber [x] = {gx : g E SO{m)} . 

Kendall's shape space is thus a quotient of a sphere allowing for two non-trivial 
canonical quotient metrics 

rf £* (i x }> [y]) ■= "if d(gx,hy)= inf d(gx,y) , 

g,h£SO(m) g£SO(m) 

d denoting either the Euclidean distance on or the spherical distance on S^: 

dpl(x,y) := ||a: - y\\, d { ^(x,y) := 2 arcsin ^ - — . 

In some applications only the form, i.e. shape without size filtered out is 
of interest. The corresponding size- and- shape space is STj^ := F^/ SO(m) 
with size- and- shape [x] := {gx : g E SO(m)} E SH^ of x E F^. For a 
sample x±, . . . ,x n E F^, partial Procrustes sample means, the size-and-shapes 
of [i* E F^ are then considered which minimize 
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3=1 

over /i g i 7 ^ . Obviously, /x* = \'YTi=\9*j x i with minimizers <?*,...,<?* € 
SO(m). These means are "partial" because no equivalence w.r.t. scaling is con- 
sidered. In contrast, in the definition of full Procrustes means above equivalence 
under the full group of similarities is considered. 



2D full Procrustes means are extrinsic means. For m = 2, iKendalll 
( 19841 ) observed that one may identify F 2 fc with C fc_1 \ {0} such that every 



landmark column corresponds to a complex number. This means in particular 
that z e C fe_1 is a complex row- vector. With the Hermitian conjugate a* = 
{akj ) of a complex matrix a ~ (ajk) the pre-shape sphere Sf is identified with 
{z e C*" 1 : zz* = 1} on which SO (2) identified with S 1 = {A e C : |A| = 1} 
acts by complex scalar multiplication. Then the well known Hopf-Fibration 
gives Yi\ — S2/S 1 = CP k ~ 2 , the complex projective (k — 2)-dimensional space. 
Moreover, denoting by M(k — 1, k — 1, C) all complex (k — 1) X (k— 1) matrices, 
the Veronese- Whitney embedding is given by 

D:E| {a G M(k- l,k- 1,C) : a* = a}, [z] ^ z*z. 

If Z E C fc_1 is a random pre-shape, identifying D f EjQ w ith E^, the set of 
extrinsic means (cf. iBhattacharva and Patrangenaru (200|)) of [Z] is the set of 



shapes of the orthogonal projection of the usual expected value E(Z*Z) to t)(E|). 
Employing complex linear algebra, the set of extrinsic means is easily identified 
as the shapes of the eigenvectors to the largest eigenvalue of the complex integral 
of squares matrix K(Z*Z). Since on the other hand 

E(S(Z, w) 2 ) = 1 - wE(Z*Z)w* . 

with the residual shape distance from ([2]), we have that that any eigenvector of 
E(Z*Z) to its largest eigenvalue is a pre-shape of a full Procrustes mean. 

Theorem 2.1. The set of 2D full Procrustes means is the inverse image under 
the Veronese- Whitney embedding of the set of extrinsic means on t)(E|). 

In most applications, the largest eigenvalue of K(Z*Z) is simple, then the 
2D full Procrustes mean is uniquely determined. 

A similar procedure gives extrinsic means using the Schoenberg embedding for 
the related spac e of Kendall's reflect ion shapes of arbitrary dimension, not dis- 



cussed here, cf. IBhattacharva! (|2008). Procrustes means for three- and higher- 



dimensional configurations, however, cannot be modeled in this vein. 



3 Asymptotics for Procrustes and Ziezold Means 



In this section we will see th at 3D and hig her-dimensional Procrustes means as 
well as a mean introduced bv lZiezoldl (119941) are special cases of Frechet p-means. 
In the following, smooth means at least twice continuously differentiable. 
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3.1 Kendall's Higher-dimensional Shape Spaces 

As we have seen above, is a manifold, namely a complex projective space. 
Similarly, SY<™ can be given a manifold structure. For m > 3, however, and 
cease to be manifolds, they only contain dense and open submanifolds 



Pm)* := (S^)*/SO(m) with (S*)* 



{x £ S 1 ^ : rank(x) > m — 1} 



{x e : rank(a') > m — 1} 

valently non- degenerate shapes, 



(SE*,)* := (F^)*/SO(m) with (**)• 

called the manifold parts of regular or equi' 
size-and-shapes, pre-shapes and configurations, respectively. A pre-shape or 
configuration x or its shape or its size-and-shape [x] is called strictly regu- 
lar if rank(x) = rn. In case of Ej^, m > 3, approaching degenerate shapes, 
some sectiona l curva tures are unbound. A detailed disc ussion can be found in 
Kendall et al. I (Il999h as well as Huckemann et al. I ll201Cft . 



Note that the square of a distance may not be smooth at so called cut points. 
E.g the square of the spherical distance between e lt and e ls on the unit circle 
S 1 C C is smooth except when e^ t_s ^ = — 1. In general on a Riemannian 
manifold, the cut locus C(p) of p comprises all points q such that the extension 
of a length minimizing geodesic joining p with q is no longer minimizing beyond 
q. If q € C(p) or p € C(g) then p and g are cirf points. 

Lemma and Definition 3.1. All of 

4$ (M, [y]) : = mhi 9GiSC) (m) dW ( fla!i y ) i 

4ff[(M>M) : = sin 4$ (Hi M) = min g e 50(m) V 1 - (fi^S/) 2 , 

(ffz, j/} > 

, z d%l ([*],[»]) 

4i*(H)[f]) : = 2sin— m-g = min geSO(m) ^2(1 - (p:E,y» , 

(H'tfl) : = min 9 GSO(m) d^l(gx,y) = min 9SSO(m) VIMI 2 + ll</ll 2 ~ 2(ffa;,3/) 

are metrics on the shape space and the size-and-shape space, respectively, and 
their squares are smooth on the respective manifold parts except at cut points. 

Proof. If the canonical quotient Q = M/G of a smooth Riemannian manifold 
M due to the action of a Lie group G = SO(m) on M is a Riemannian manifold, 
then the intrinsic distance on Q is a smooth metric (cf. lAbraham and Marsden 



( 19781 Chapter 4.1)). This gives the smoothness of (d^) 2 on the respective 



manifold parts which are dense in the shape space and the size-and-shape space, 
respectively, except at cut points. Hence, the d: extend to metrics on these. 
Moreover, d s i is a metric due to convexity and monotonicity of t — > 2sin(i/2) 

for t € [0, tt]; for d^) we rely on lKendall et al.l (|l999l p. 206). □ 



We say that x is in optimal position to y if maXg G 5 ( m ) (gx, y) = (x,y). In 
particular then d^l {[x], [y]) — \\x — y\\ and d^ k ([x], [y]) — \\x — y\\, respectively. 

3.2 Frechet p-Means 

The Central-Limit-Theorem (CLT) derived below relies on the "5-method" for 
smooth transformations. In consequence, we can only expect a CLT theorem 



7 



to hold on the manifold part of the shape space and the size-and-shape space, 
respectively. For the following we assume that (Q,d) is a metric space and P 
is a £>-dimensional smooth Riemannian manifold. For example, Q = and 
P = (S^j)*. Suppose that X, Xi, X2, ■ ■ ■ are i.i.d. random variables mapping 
from an abstract probability space (f2, A, P) to Q equipped with its Borel a- field. 
For a random vector Y € M. D , E(F) denotes the usual expectation, if defined. 

Definition 3.2. For a continuous function p : Q x P — > [0,oo) define the set 
of population Frechet p-means of X in P by 

£W(X) = argmm MeP E(p(X,,i) 2 ). 

For uj £ fl denote the set of sample Frechet p-means by 

n 

E ( n P) M = argmin^p p( X i M » A*) 2 • 



Since their original definition by iFrechet (.1948) for P = Q and p = d, 
such means have found much interest. IZiezoldl (|1977l ) extended the concept to 
p being a quasi-metr ic only. On Riemannian manifolds (Q = P) w.r.t. the 



Riemannian metric p, iBhattacharva and Patrangenarul ( 20031 120051 ) introduced 
the corresponding means as intrinsic means, and, taking p to be the metric of 
an ambient Euclidean space, as extrinsic means. 

In consequence of Lemma and Def . 13.11 and the connection with the residual 
distance ©, d^l ([x], [y]) = 6(x,y), we can at once identify Procrustes means. 

Corollary 3.3. The set of p-Frechet sample means on the shape space and 
size-and-shape space, respectively, are the sets of 

7 (p) 



(i) full Procrustes sample means for p 
(ii) partial Procrustes sample means for p 



Ai) 



For m = 2, Ziezoldl (1994) introduced mean shapes w.r.t. p = d^l which 



have been studied also by|LeJ (|1998l ). The following first two definitions honor 
his memory. The other extend sample Procrustes means to the population case. 

Definition 3.4. Call d^l the Ziezold metric on the shape space, and the Frechet 



d^l -means the set of Ziezold means. The Frechet d^l -means are the set of full 

m , (i) m 

Procrustes means, the Frechet d s ^ k -means the set of partial Procrustes means. 

Ziezold means and full Procrustes means, even though defined earlier, can 
be thoug ht of as a generalization of extrinsic means and means for crude resid- 
uals (cf. iMardia and JupdI ( 2000f )). respectively. Partial Procrustes means are 
extensions of intrinsic means to non-manifolds. 

Let us now touch on the issues of existence and uniqueness for the above 
introduced means. Both are are rather simple issues for extrinsic means, since 



(p) 
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they are orthogonal projections of classical Euclidean means in an ambient space 
to the embedded manifold in question , this projection being well defined exce pt 
for a set of Lebesgue measure zero (cf. iBhattacharva and Patrangenaru ( 2003f) ). 
For non-extrinsic means as above, however, existence, namel y that the means ar e 
assumed on the manifold part is a deeper issue tackled in Huckemannl ( 2010l ). 
Intrinsic means are u nique if the underlying random elements are sufficiently 
concentrated (cf. |]3 ( 2001 ) for the elaborate proof). For higher-dimensional 
Ziezold and full Procrustes means, to the knowledge of the author, there are no 
uniqueness results available. 

Since we ar e concerned w ith metri cs, the following is a consequence of th e 
Strong Law by IZiezold ( 1977 ). cf. also Bhattacharva and Patrangenaru ( 2003f ). 



Theorem 3.5. Suppose that X is a random pre-shape in or a random 
configuration in with unique Ziezold, full Procrustes or partial Procrustes 
mean [fi]. Then every measurable selection [fi„ (uj)] from the sets of sample 
Ziezold, full Procrustes or partial Procrustes means, respectively, is a strongly 
consistent estimator of [fi] . 

For the following we need the condition that [X] is bounded away from the 
cut locus of the mean a.s. in the Ziezold or full/partial Procrustes sense 



3e>0 such that d([X],[(i]) < d(C(\fi]),[(i]) 



e a.s. 



(3) 



where \p] and d are corresponding Ziezold, full Procrustes or partial Procrustes 
means and distances, respectively. 



3.3 The Central-Limit-Theorem 

The Central-Limit-Theorem for Frechet p-means requires additional setup. 

Definition 3.6. Let P be a D-dimensional manifold. We say that a P-valued 
estimator n n (u)) of [i € P satisfies a Central-Limit-Theorem (CLT), if in any 
local chart (<fi,U) near /i = (^ _1 (0) there are a suitable D x D matrix and 
a Gaussian D x D matrix Q§ with zero mean and semi- definite symmetric co- 
variance matrix such that 

y/nA^((j)(n n ) - (j>(n)) -> 

in distribution as n — > oo . 

In most applications is non-singular, then in consequence of the "6- 
method", for any other chart (<f)', U) near [i = (/>'~ 1 (0) we have simply 



a~}^{a~}) t = j{cp> o ^oA-^iA-yj^ o r 1 ) 



1\T 




where J(-)o denotes the Jacobian matrix of first derivatives at the origin. 

In consequence of T heorem 13.51 and Theore m IA.1I fro m the appendix (as- 
sertion (ii) follows from Bhattacharva and Patrangenaru ( 20051 )) we have the 
following. 
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Theorem 3.7. Suppose that X is a random pre-shape in or a random 
configuration in F^ t , [X] having a unique Ziezold, full Procrustes or partial 
Procrustes mean [fj] on the respective manifold part. Then every measurable 
selection \p, n (oj)] from the sets of sample Ziezold, full Procrustes or partial Pro- 
crustes means, respectively, satisfies a Central- Limit- Theorem if [X] is bounded 
away from the cut locus of the mean a.s. in the Ziezold or full/partial Procrustes 
sense 0) under the following additional condition: 

(i) none in case of Ziezold or full Procrustes means, 

(ii) in case of partial Procrustes means, if the Euclidean second moment E(\\X\\ 2 ) 
is finite. 

In a suitable chart (</>, U) the corresponding matrices from Definition \ S.6\ are 
given by 

A = E(Hp([X], [//])), E = COV(grad/>([X], [//])) 

where p denotes the distances d^l , d^l and d^ k , respectively. Moreover, grad 
and H denote the gradient and Hessian of x n- /o([X], [if)^ 1 (x)}) 2 , respectively. 

Numerical simulations show a rather good finite sample accuracy of the CLT, 
cf. Figure [TJ 




Figure 1: Depicting the proximity to the 
standard normal density ( dashed line ) of the 
normalized density (solid line) of the non- 
zero coordinate of full Procrustes means of 
samples of size n = 10 from the geodesic 
perturbation model fP[) in Eg with = and 
Si ~ N(0, 0.1 2 ) with the two-dimensional 
mean /i = diag(l, — 1, 0)/v / 2'H T and v — 
diag(l,l,l)/\/3H T . 



Remark 3.8 . For m > 3 7 computing Ziezold means (for an algorithm cf. 
Ziezold \199A )) is computationally less costly tha n computing full Pro c rustes 



means ( corresponding algorithms are discussed in \Dryden and Mardid \l99& . 
Chapter 5.3)): for full Procrustes means in every iteration step every single op- 
timal positioned datum needs additionally to be projected to the tangent space. 
The simulations reported in Section and the Bootstrap simulations in Sec- 
tion^ give similar results but are approximately 15% faster when using Ziezold 
means instead of full Procrustes means. 



Remark 3.9. \Grossiei\ >!200d) proves that the above algorithms converge under 



rather broad conditions and supplies error estimates. 
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4 Perturbation Inconsistency for Kendall Shapes 



For brevity in this section unless otherwise specified, Procrustes means refer to 
full Procrus t es me ans. 

iGoodalll (Il99lh proposed to model a sample of landmark configuration ma- 



trices yi (i — l,...,n) with a perturbation model that since then, has been 
highly popular in the community: 

Vi = AiPi(/x + ei) + *,. (4) 

The Xi > convey scaling, the gi are rotation matrices and the ti stand for 
translations. Obviously, these three sets of nuisance parameters keep the shape 
invariant, of interest is only the deterministic perturbation mean /i and the 
random perturbations ti with zero expectation. If the size is also of interest then 
set Xi — 1 keeping only the form invariant under rotation and translation. Under 
the assumption of isotropic Gaussian errors gj, for estimation a nd inference on 
the population perturbation means /i^ and error covariances e^, Goodall ( 199ll ) 



proposed to use sample Procrustes means obtained from GPA, cf. Section [2j 
In the sequel, the property that (for arbitrary errors) sample Procrustes means 
converge to the shape of the mean of an underlying perturbation model has 
been coined as the "consistency of Procrustes means" . 

4.1 Isotropic 3D Error 



In order to validate Goodall's proposal, iKent and Mardial (|1997l) studied the 
analog of the perturbation model (f?]) on the pre-shape sphere and showed that 
for 2D configurations with isotropic Ci, the Procrustes po pulatio n mean from ([?]) 
is identical with the shape of the perturbation mean. iLel (|l998l) extended these 
results and showed that under slightly relaxed conditions for 2D configurations, 
intrinsic, Ziezold and Procrustes means all agree with the shape of fi. 

For 3D shapes the above arguments are no longer valid because the shape- 
fibres in are not spanned by geodesies in general (see iHuckemann et al 



( 2010l Example 5.1)), hence equality of Procrustes means with the shape of a 



perturbation model cannot be expected, even for very small isotropic error. 

In order to assess the practical impact of this effect, we measure the distance 
between the shape of the perturbation mean and its corresponding Procrustes 
mean. In view of Theorem 13.71 this can be done by determining the distance 
between the shape of the perturbation mean and a corresponding sampled Pro- 
crustes mean while confidence or equivalently its accuracy can be estimated by 
distances of correspondingly sampled sample Procrustes means to their sample 
Procrustes mean. The following considerations detail this setup. 

Consider a random X — YT-i / \ \ VH | £ S*, with Y from a perturbation model 
(01 such that the following means are unique. Assume fi £ and denote by 

v the pre-shape of the Procrustes population mean in optimal position to fi; 

the random pre-shape in optimal position to v of the Procrustes sample 
mean of X\ , . . . , X n i.i.d. as X; 
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1 
0.3 
0.1 




0.01 




0.5 
0.1 
0.01 



0.5 
0.1 
0.01 



0.1 
0.01 
0.001 



0.011 
0.0023 
0.00022 



0.30 
0.016 
0.00026 



0.12 
0.0055 
5.0e - 05 



£(n,2V) 

0.0097 
0.0021 
0.00020 



0.030 
0.0022 
0.00020 



0.082 
0.00044 
2.0-05 



Table 1: Estimated distance d( n ' N ^ between shape of the mean [i of the pertur- 
bation model and corresponding Procrustes mean with standard error a^ n ' N ' 
for n = 10, 000 and N = 10. The error follows independently J\f(Q, a ) in every 
component. For convenience the Helmertized pre-shape /jH is reported display- 
ing proximity to degeneracy. 



v~f^ for j = 1, . . . , N i.i.d. realizations of v^ n \ 



p( n > N ) a realization of the Procrustes sample mean of p[ , . . . , in optimal 
position to v. 

Then in consequence of Theorem 13.71 



(n,N) 



\ 



N 



gives an approximation of the confidence into or accuracy of the measurement 
of v by #( n ). On the other hand, we have the approximation 



df"- N > := 



\ 



1 



JV 



,(p) 

3=1 



<£2 (W) 



(«)i 



The goodness of both approximations depends on AT. Moreover for fixed A'", 
a {n ^ can be made arbitrarily small by choosing n sufficiently large. If along- 
side, d( n ' N ^ also becomes equally small, this can be taken as strong evidence that 
the shape of the perturbation model agrees with the corresponding Procrustes 
mean. Otherwise, we have strong evidence, that the two disagree. Table [1] re- 
ports values of d^ n ' N ' and a^ n ' N ' for typical simulations of ((H) for m = 3, k = 4 
with isotropic i.i.d Gaussian error e of variance a 2 . The following remark sum- 
marizes the results of these and further simulations. 

Remark 4.1. In approximation the perturbation model with isotropic errors 
is compatible with the geometry of £3 if the perturbation mean is far from de- 
generate. For nearly degenerate perturbation means, however, the perturbation 
model may be incompatible even for small isotropic errors. Using Ziezold means 
instead of Procrustes means gives similar results. 
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4.2 Non-Isotropic 2D Error 

Global effects. Recall from Section^ modeling E| by a complex projective 
space. Since the eigenvectors of the complex integral of square s matrix may vary 
discon tinuously under continuous matrix variation this led iKent and Mardia 
(1997) to the conclusion that Procrustes means can be inconsistent where in 
fact they are statistically consistent but possibly discontinuous. 





Figure 2: Great circle in 
£2 spanned by the shapes 
of the perturbation {5[). 
The little circles denote 
100 sampled shapes, the 
star their Procrustes sample 
mean and the triangle the 
respective Procrustes popu- 
lation mean. The respec- 
tive headers give the stan- 
dard deviation of the Gaus- 



In view of the perturbation model (j4]) consider in the spirit of Kent and Mardial 
(|l997l p. 285) the complex configuration /1 = (\/2, 0, l/\/2) £ C 3 additively per- 
turbed by the non-isotropic random e = (0, 0, \/&rjt/2) e C 3 , t ~ 7V(0, 1), rj > 0, 
modeling linear configurations with two fixed endpoints and a random landmark 
varying in the middle. The Kendall pre-shape of 



is given by z 



V 1 + j ' 2i: 



M + e (5) 
= (l,7?i) with the complex integral of squares matrix 











For small error intensity n, the shape of \x gives the Procrustes population 
mean. For higher error intensity, however, the Procrustes population mean 
indeed changes abruptly to the shape of the configuration v — v2j — 2v2)- 
In order to visualize the situation in the shape space recall the Hopf fibration 



S, 3 



£3 = £p 1 ■ (ai,a 2 ) 1 ^ ( Re(aioi),Im(aioi) 



a 2 



mapping from a three-sphere to a two-sphere. The counter-intuitive behavior of 
discontinuity of the Procrustes mean is visualized in Figure [2j As is clearly vis- 
ible, the discontinuity of the Procrustes mean in the error's standard deviation 
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V&r}/2 is due to the fact, that the corresponding shapes move along a common 
great circle in S|, from clustering at the top (the shape of u) to clustering at 
the bo ttom (the shape of v) . The discontinuity observed by iKent and Mardia 



(I1997T ) thus reflects a global effect of an "incompatibility" of the perturbation 



model with the geometry of the shape space. 



Local effects. As above, iLeiel (1993'i Figure 3 on p. 598) considers an example 



of a distribution of planar but now quadrangular configurations along a straight 
line in the configuration space For simplicity of the argument consider 
/i = {i, 1, — i, —1), e = (0, i, 0, — i) and the sample z\ = jJb + e, z-2 = fi — e , 
e > 0, with mean /i in Euclidean F$. One verifies immediately that Z\ and 
Z2 are not in optimal position, rather A = ^g(l + 2i) puts z\ into optimal 
position \z\ to Z2 w.r.t. to the action of SO(2). Then the form of (Azi + Z2)/2 
is the partial Procrustes mean, which is different from the form of fi. The 
alleged "inconsistency" of the partial Procrustes mean is now a local effect of 
an "incompatibility" of the perturbation model with the canonical geometry of 
the size-and-shape space STlf- 



5 Asymptotics for Diffusion Tensors 



In d iffusion tensor n e uro-im aging (for a short introduction into this young field, 
e.g. IVilanova et all (|2006t l). the dominating eigenvector of a symmetric semi- 
positive definite (SPD) matrix from the space P(m) = {0 ^ a £ M(to,to) : 
a T = a > 0} exhibits the dominating direction of molecular displacement due 
to a flow in tissue fibres of interest. The statistical and non-statistical literature 
to the task of reconstructing SPD matrices which are called diffusion tensors 
(DTs) in this context is vast. Recently, matching techniques involving shape 
analysis (e.g. Cao et al. ( 20061 )) have gained momentum. 

A non-standard statistical approach to reconstruct a mean DT from an 
observed sa mple of neighboring D Ts based on Procrustes methods has been 
proposed by Drvden et al. ( 2009bl ). One motivation comes from perturbation 
models, which, as it turns out below are "inconsistent" with Procrustes means. 
Coming as a surprise, however, practical image reconstructions based on this 
method in particular for nearly degenerate DT s, have produced results of con- 
vincingly good quality, cf. iDrvden et al. ( 2009a ). In the following, this approach 
is first extended to mildly rank- deficient DTs and secondly, perturbation incon- 
sistency is assessed. Even though in most applications to imaging, the flow is 
observed in 2D or 3D, i.e. m = 2, 3, the following applies to arbitrary to G N. 

Diffusion tensors observed in the "real world" fall into the sub-space P + (to) = 
{a G P(m) : a > 0} of symmetric strictly positive definite matrices. This space 
- even though much more complicated than complex projective space - has 
been well studied (as the "universal symmetric spac e of non-com pact type with 
non-positive non-constant sectional curvatures" e.g. Land ( 19991 . Chapter XII)). 
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Via the well known Cholesky factorization 

chol : P+(m) -> UT + (m) , 

this space can equivalently be modelled by the group of m-dimensional up- 
per triangular matrices UT + (m) with positive diagonal. Traditionally diffusion 
tensors have been modelled within these spaces, statistical analysis has either 
been carried out by using the extrinsic distance inherited from M(m,m) (e.g. 
Paievic and Basserl (12003 )) or the intrin sic distance due to the aforementioned 



structure (cf. iFletcher and Joshil (|2004h ). Modeling flow in ideal micro-fibres, 
obviously leads to rank-deficient diffusion tensors, which, however, are not con- 
tained in these manifolds. Unless one utilizes a Euclidean embedding e.g. in 
the space of symmetric matrices, a new structure has to be found. 

5.1 A CLT for Mildly Rank-deficient Diffusion Tensors 

Here, an embedding P(m) is proposed, that maps the space 

P*(m) = {a G P(m) : rank(a) > m — 1} 

of mildly rank- deficient diffusion tensors into the manif o ld part This 
embedding is inspired by recent work of iDrvden et al. (2009a) who, in effect, 



model P + (m) as a subspace of More subtly one could model as well 

using Kendall's reflection size- and- shape space. In fact, modeling within size- 
and-shape space is equivalent to embedding reflection size-and-shape space in 

size-and-shape space. 

To this end consider the following canonical domain for upper triangular 
matrices 

UT(m) 

= | a £ M(m, m) : there is 1 < ig < m such that for 

i = 1 : there is m > j\ > 1 with Ojjj > and an = for all 1 < I < j\ 
2 < i < io : there is m > ji > ji-i + 1 with an = for all 1 < I < ji, and 

a 

io < i ■ aij = for all j = 1, , m 

the sphere SUT{m) = {a G UT(m) : \\a\\ = 1}, UT^(m) = {a G UT{m) : 
a mm > 0} and SUT^(m) = SUT(m) n UT^(m). Moreover, with a bijective 
extension of the Cholesky factorization, the corresponding canonical projections 
from Section[2] ir and sir, s(x) = \\x\\ and bijections <fi, 4> s , consider the following 
diagram of mappings: 

P{m) 
$chol 

UT>(m) SO(m)xUT(m) ( - sl ' a )? sa ^ S™ +1 x ^ 

I sir X4> 
STZ +1 SUT^(m) 

t<Ps 

UT^(m) 

In particular, ([5]) gives rise to the following mappings 

t s : P(m) -> SX™ +1 and r : P(m) -> . 



(6) 
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Theorem 5.1. The mappings t s and r are well defined, open and continuous. 
Moreover 

(i) t s restricted to the space of mildly rank- deficient diffusion tensors P*(m) 
is an infective mapping into the manifold part (S'I]™ +1 )* of Kendall's size- 
and-shape space. Restricted to the space of full rank diffusion tensors it is 
a diffeomorphism onto an open subset. 

(ii) t restricted to the space of mildly rank- deficient diffusion tensors P*(m) 
maps into the manifold part Restricted to the space of full rank 

diffusion tensors it is a submersion of codimension 1 onto an open subset. 

For the proof of Theorem 15. II and diagram ([6]) we refer to the appendix. 

Remark 5.2. In consequence, inference for mildly deficient-rank diffusion ten- 
sors can be carried out via t s or r utilizing the CLT: Theorem \3.7\ Moreover, 
mean shapes can be pulled back under (t s ) _1 to obtain mean diffusion tensors if 
they stay away from the region corresponding to a mm < in UT(m). 



5.2 Perturbation Inconsistency for Diffusion Tensors 

A typical perturbation model considered bv lDrvden et al. ( 2009a ) is 

Xi = (/i + e 1 ) T (M + e- i ) (7) 

with perturbation mean /i G UT(m) and error et following a Gaussian distri- 
bution A/"(0,er 2 ) independently in every component or in every upper diagonal 
component and zero in every strictly lower diagonal component. In order to 
relate the quality of the embedding t s from © to other structure s for the space 
of diffu sion tensors (e.g. that of the universal symmetric space) . iDrvden et al 



(2009aj) compare the partial Procrustes distance between t s (/j, t fi) and partial 
Procrustes means from samples of ([7|). 

For illustration we report in Table [2] a simulation in analogy to Section 14.1 
(using partial Procrustes means instead of full Procrustes means, cf. Table [1} 
for three typical diffusion tensors for upper triangular isotropic errors. The 
situation is similar for isotropic errors. The values of S n ' N) for a = 0.1 m our 
Table[2]correspond to the "RMSE(d s ) of E s " reported in the blocks labelled "II" 
in "Table 2" (corresp onding to our second block) and "Table 3" (corresponding 
to our third block) of Drvden et al.l (|2009al ). Here, however, we used n = 10, 000 



and identified these numbers as a measure only for perturbation inconsistency 
by additionally computing the standard error. We remark the consequence. 

Remark 5.3. Suppose that X follows a perturbation model ([?]). For upper trian- 
gular perturbation mean \i and error e this is an anisotropic perturbation model 
for ST<™ +1 . For e independent Gaussian in every component, E(chol(JT)) ^ 
chol(/i) in general (cf. Lemma \A.S\ in the appendix). Simulations corroborate 
inconsistency with the geometry of SH\ . Surprisingly it seems that incompati- 
bility is not increasing, near degeneracy. This observation may be taken as an 
explanation for successful modeling of nearly rank deficient diffusion tensors via 
Kendall's size- and- shape spaces and certainly deserves further research. 
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1 
0.3 
0.1 



1 
0.01 




0.1 
0.01 
0.001 



0.1 
0.01 
0.001 



0.1 
0.01 
0.001 



0.12 
0.012 
0.0012 



0.0017 
0.00017 
1.6e- 05 



0.023 
0.00026 
2.4e - 05 



0.0020 
0.00023 
2.5e- 05 



0.14 
0.0085 
0.00081 



0.0019 
0.00021 
2.2-05 



Table 2: Estimated distance d^ n ' N ' between size- and- shape of /i from the pertur- 
bation model and corresponding partial Procrustes mean with standard error 
a-(n,N) j or n _ 20,000 and N = 10. The error follows independently Af(Q,o~ 2 ) 
in every upper triangular component. 



6 Assessing Tree Bole Cylindricity 

We conclude with an application from forest biometry. As basic descriptors 
for the shape of tree boles, taper curves relate height above ground level with 
the area of a cross section at that height. The shape of a tree stem is thus 
described by a two-dimensional curve. Em pirical cu r ves ca n be directly an- 
alyzed with methods of shape analysis, e.g. iKrepelal ([20021 ) or, sophisticated 
mod els based on biologic a l and ela stomechanical contex t can be sought for, 
e.g. IChiba and Shinozakil dl994|) or iGaffrev et al.l (Il998l) . For a discussion of 
current taper curve models cf. iLi and Weiskittell ( 201o( ). While taper curves, 
following Cavalieri's principle, model tree stems b y circular frusta, recently b ole 



ellipticality has be e n stu died more close ly, e.g. Skatter and Hoibol (|l998l ) or 



iKoizumi and Hiral ( 2006 ). In particular, Rappold et al.l ( 2007 ) model 3D logs 



by elliptical frusta and discuss the impact of their findings on commercial log- 
ging: proximity to circular frusta increases the volume produced and reduces 
the number of turns to reposition logs in sawing machines and thus the total 
sawing time. 



6.1 Data Description 

Usually, tree stems are divided into three different parts: a short neiloid bottom 
part with strong tapering connecting with the root system, the main bole with 
little tapering - which is of prime commercial interest - and a conical top (e.g. 
Li and Weiskitte] (2010)). Typically a butt log that is a frustum taken above 
breast height (1.3 m) is used to assess bole quality. For our application, we use 
1 m butt logs from five Douglas fir trees typica l for the inside of a small ex peri- 
mcntal stand in the Netherlands as detailed bv lGaffrev and Slobodal (l200lh . At 
about the age of 10 to 15 years, tree crowns met; subsequently with almost no 
thinning, competition for light increased strongly. The entire ring structure of 
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Figure 3: Distorted view (the center frustum in the bottom row has a diameter 
of approx. 50 cm) along 36 angles of 1 m butt logs of five Douglas firs at ages 
8 years (top row), 30 years (middle row) and 62 years (bottom row). 

bottom and top disk has been elaborately reconstructed along 36 equally spaced 
angles allowing to reconstruct the butt logs for every age beginning from 8 years 
to 62 years as displayed in Figure [3] for early, intermediate and ultimate age. 

6.2 Elliptical-Like Frusta: A Totally Geodesic Subspace 

In order assess the deviation of a 1 m frustum from a cylinder (of unknown 
dimensions), we compute the shape distance to the space of all cylinders. To 
this end, and in order to compare with a model building on elliptical frusta, we 
introduce a suitable subspace of configurations and shapes. 
For given a T , b T , 1 T = (1, . . . , 1) T G M K with 

(a,b) =0= (a,l> = (b,l) (8) 
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and r, a, t > let 




Wa,P,r,t 



be the configuration of an elliptical-like unit height frustum. In case of a = 
( cos(27t/k), . . . , cos(27tk/k)) and b = (sin(27r//t), . . . , sin(27r«;/fi;)) we have an 
elliptical frustum of mean radius r, tapering t with bottom half axes of length 
r, ra and top half axes of length tr, tr/3. A straight frustum has t = 1, a circular 
frustum has a = 1 = j3 a nd a cylinder is a straight c ircular frustum. On the 
grounds of the findings of IChiba and Shinozakil (jl994h . that for a large middle 



part of the bole, there is little shape variation when moving upward, a possible 
torsion between top and bottom ellipse is neglected in this model. Denote by 



l'S ah : { Tj^jj- : w G P a ,b } s ' 



2 k 



the pre-shapes of all elliptical-like frusta determined by a, b G M. K satisfying ©. 
Here 

P^b ■■= {w a ,^ r ,m : a,/3,r,t > 0} c F| K 

and "H denotes the Helmert sub- matrix from ([1]) in Section [2j 

Recall that a submanifold PcMis totally geodesic if for any two points in 
P any minimal geodesic segment joining the two in M is contained in P. 

Theorem 6.1. Consider a T ,b T £ M. K satisfying (0). Then the shapes of all 
elliptical-like frusta form a totally geodesic submanifold of (Y^)* with horizontal 
lift PS^b to S% R . 

The proof of Theorem 16.11 is found in the appendix. We have at once: 
Corollary 6.2. The shapes of cylinders form a segment on a geodesic in (Eg")* . 

6.3 Data Analysis 

The distance of the shape of a frustum to the g eodesic spanned by the sh apes of 
cylinders is computed via a method proposed in lHuckemann et al. I (l2010h . bmce 



for every age considered there are only five frusta, a 95 % confidence band for 
the distance of the full Procrustes mean to the geodesic of cylinders has been 
computed by 200 Bootstrap resamples. As clearly visible in Figure IU young 
frusta until the age of approx. 15 years tend to toward cylinders. At later ages 
they tend away again. The change point at approx. 15 years can be explained 
by the fact that at this time competition for light began. 

Note that uniform growth naturally decreases tapering and ellipticality. In 
order to compare the initial growth to uniform growth, three curves of elliptical 
frusta starting with t and a = (3 in the range of corresponding estimated butt 
log values as well as with size growth identical to the mean size growth are 
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depicted in the left display of Figure 0] (95 % of estimated tapering coefficients t 
observed are between 0.864 and 0.986; a crude estimate for a and /3 gives 95 % 
of observed values between 1.02 and 1.13 (after rotating such that a,/3 > 1)). 
This comparison yields that observed initial growth towards cylinders appears 
even stronger than uniform. 

In order to assess the growth of the older logs, compare it to the growth of 
elliptical frusta keeping tapering and ellipticality constant while letting size grow 
with the mean size of the original data. Three of such curves are depicted in 
the right display of Figure SJ The observed movement away from the geodesies 
of cylinders appears rather similar. 




oc505 aMo oM5 M20 a325 acJoB aMo otJTs MSo a325 

distance to geodesic of cylinders distance to geodesic of cylinders 



Figure 4: Solid: confidence band for the distance of the Procrustes mean 
shape of 1 m butt logs above breast height to the geodesic of cylinders for 
each year based on 200 boostrapped Procrustes means. For comparison dashed: 
uniform equalsize growth of elliptical cylinders (left display) starting with 
(a = p,t) = (1.07, 0.93), (1.1, 0.9), (1.15, 0.85) from left to right; equalsize 
growth of elliptical cylinders (right display) with constant (a = /3,t) — 
(1.025, 0.975), (1.05, 0.95), (1.1, 0.9) from left to right. 

Summarizing, this study indicates that tree boles of young Douglas fir trees 
with little competition grow uniformly or even stronger towards cylinders. Older 
tree boles their crowns competing for light, grow as if they keep ellipticality and 
tapering constant. These findings certainly call for more elaborate research, e.g. 
tapering and ellipticality can be investigated by developing a method to study 
distance and projection to the submanifold of elliptical-like frusta. 



7 Conclusion and Outlook 

In this paper for the statistical analysis of 3D shape, an asymptotic result for 
mean shape has been derived, a classical perturbation model and a newly pro- 
posed perturbation model for the statistical analysis of diffusion tensors has 
been revisited, and inference on the temporal deviation of the 3D shape of tree 
boles from cylinders has been performed. 
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Although Procrustes means are very popular, it seems that due to a misun- 
derstanding, the notion of 3D Procrustes population means had not been quite 
available in the community. Instead, population means of a perturbation model 
had been estimated by Procrustes sample means which were, as was well known, 
"consistent" estimators under isotropic 2D errors. Introducing Frechet p-means 
in this work and extending the available Central-Limit Theorem to underlying 
non-metrical distances, in particular, the issue of "consistency" has been iden- 
tified as an "incompatibility" of the perturbation model with th e shape s pace's 
geometry. Moreover, we recalled a Frechet p-mean introduced bv lZiezoldl (|l994T l 
the computation of which in 3D is computationally slightly less costly than the 
computation of Procrustes means. This result allows for one-sample tests for a 
population Procrustes or Ziezold mean on the manifold part of the shape space, 
since due to strong consistency, sample Procrustes or Ziezold means will even- 
tually come to lie on the manifold part a.s. For a two-sample test, one would 
need to ensure that Procrustes and Ziezold means are contained on the manifold 
part, if the underlying random shapes are a.s. contained i n the manifo l d par t. 
Settling this issue is the subject of a separate research, cf. Huckemann ( 2010f h 

While many means, different from classical intrinsic, extrinsic or Pro c rustes 
means are Frechet p-means (e.g. geometric medians of Fletcher et all ( 2008 ) 
and (penalized) weighted Procrustes means of Drvden et al. ( 2009bl )) it would 
be interesting to v erify whether a larger class of means, e.g. the semi-metrical 
mean introduced in I Schmidt et al.l ([20071 ) and successfully employed in computer 
vision also fall into this setup. 

Within the discussion of "consistency" , the modeling o f lDrvd en et al. (2009a) 
for diffusion tensor imaging has been extended to diffusion tensors mildly defi- 
cient in rank. To the knowledge of the author, this is the first framework allowing 
for statistical inference on non-regular diffusion tensors without utilizing a Eu- 
clidean embedding, say, in the space of symmetric matrices. Of course in many 
applications, the interest lies specifically in degenerate tensors because these 
indicate a strong directional flow. The finding that mildly rank-deficient diffu- 
sion tensors are only "mildly p erturbation inconsiste nt" may provi de for another 
motiva tion for the approach of[5 rvden et al. l (|2009allbT l. Following iDrvden et al.l 
(2009a)), we have used Procrustes means. For the underlying reflection shape 



space, extrinsic Schoenberg means qualify as well, they compute much faster 
than Procrustes or Ziezold means. As the most important advantage, the two- 



sample tests of lBhattacharval (|2008l ) can then be performed even including rank 
1 diffusion tensors. A drawback, however, may result from a possible insensi- 
tivity of S choenberg means to degeneracy thus yielding a smaller discrimination 
power (cf. Irluckemann (|2010h for a detailed discussion). These issues certainly 
warrant further research. 

Briefly compiling the findings on tree boles we can add to the well known 
fact that longest boles can be found in dense stands, it seems that the most 
cylindrical boles can be found in young stands with low competition. 



A geodesic perturbation model. As demonstrated in this work, for most 
practical applications in 3D avoiding degenerate configurations, perturbation 
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models may be considered compatible with the shape space's geometry for 
isotr opic error. Note that th e shapes of entire tree boles are nearly singular 
(cf. iHuckemann et aL ( 2010| )). For short bole frusta, however, such models 



may still serve intuition and for sufficiently small error provide an adequate 
approximation. As one of such consider the geodesic perturbation model 

x% = gi\i{^{si) + Ei) (9) 

with a straight line s — > j(s) = p + sv in configuration space such that 
locally all points on 7 are in optimal position to each other. For 3D frusta, the 
geodesic of cylinders has been used here, any other geodesic of specific frusta can 
be used similarly. Moreover, based on larger samples, a geodesic perturbation 
model within the space of elliptical-like frusta can be used to assess specific 
model parameters. In particular, a close investigation of such model parameters 
may lead to inference on the impact of environmental effects on the shape of tree 
boles building on artificially indu ced modifications of biological tree parameters 
as in lGaffrev and Slobodal (l2004 . 
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A Appendix: Proofs 

The Central-Limit-Theorem for Frechet p-means. With the notation of 
Section l3~2| suppose that p : Q x P — > [0,oo)isa map, p 2 smooth in the second 
component as specified below. In a local chart {(f), U) of P near _1 (O) € P, 
denote by grad 2 p{q,p) 2 the gradient of x H> p{q 1 4>^ 1 {x)) 2 and by H2p(q,p) 2 
the corresponding Hessian matrix of se cond order derivatives. Then similar to 
iBhattacharva and Patrangenaru (2005, p. 1230) consider the following integra- 



bility condition on a random variable X on Q at a location p e P: 

E(grad 2 p(X, p) 2 ) exists, 
COV(grad 2j o(X,^) 2 ) exists, ^ (10) 

K(H2p(X, v) 2 ) exists for v near p and is continuous at v = p. 

In analogy to condition consider 

3e > such that x h-> p(X, (f>'~ 1 (x)) 2 is smooth for \x\ < e a.s. . (II) 

The validity of (jTUf and (jlll) is independent of the particular chart chosen. If p 
is the Euclidean distance or p — d^ k , then (fTUl is valid if E(|| X\\ 2 ) < 00. 

If there is a discrete group H acting on P and p € P such that — {hp : 
h £ H}, we say that the Frechet population p-mean set is unique up to the 
action of H. 
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Theorem A.l. Suppose that p is a point in the Frechet p-mean set unique 
up to the action of H on a manifold P with respect to a continuous function 
p : Q x P — > M, p 2 smooth in the second component in the sense of Ul\) . 
satisfying strong consistency, where Q is a topological space. If 

for any measurable choice p n £ E n there is a sequence h n £ H such that 
Mn = h n p n — > p a.s., and if 

(i) X has compact support or 

(ii) the integrability conditions ilO\) are satisfied at p 

then for any measurable choice p n € E n p ^ there is a sequence h n G H such that 
[i-n = h n p n satisfies a CLT. In a suitable chart {<f>, U) the corresponding matrices 
from Definition Iff. 61 are given by 

A = E(H 2 p(X, p) 2 ), S = COV(grad 2 p(X, p) 2 ) . 

Proof. Obviously, if X has compact support then (|TU| is satisfied. Hence we may 
assum e the case (ii) . We adapt the ideas laid out in lBhattacharva and Patrangenaru 
1 20051 p. 1229-1230). Let D denote the dimension of P and consider a local 
chart (U,4>) near p = ^ _1 (0). We have a.s. eventually that p n € U, then 



<t>{Hn) 



— » a.s. Abbreviating g(x) = Y^j=i P(Xj,4> 1 (x)) 2 , gradg(a 



(gi(x), . . . ,gD{x)) T and Hg{x) for the Hessian we have by (fTTj) the Taylor ex- 
pansion 



— gradg(> n ) 
V". 



— gradg(O) 



1 



I (gradgi(tia' n )) 2 



\ 



V (gradg D (t D x„)) T ■ x n J 
/ (gradgi(tia; n ) - gradgi(0)) T • ^nx n \ 

V" « « ' 

V (giadg D (t D x n ) - grad g D (0)) T ■ ^/nx n J 

for suitable < t\, . . . , to < 1 a.s. In conjunction with the classical CLT, the 
first two conditions in (fTU)) ensure that 



: gradg(O) + -Hg(0) ■ + 



1 

gradg(O) = — V grad 2 p(Xj , ^ (0)) 2 



- g 



in distribution for some Gaussian matrix Q with mean E( grad 2 p{X, <f> 1 (0)) 2 ) = 
and covariance matrix COV ( grad 2 p(X, (fr 1 (0)) 2 ) . The third condition guar- 
antees the Strong Law of Large Numbers for the mean of random Hessians 

1 



1 

-Hg(0) = -J^^piXj^-^O)) 2 

Note that the k-th entry of i ( grad gi(tiX. 



E{H 2 p(X,fj,y) =: A a.. 



did. 



a 2 

9id k 



gradgi(0)J for 1 < i, k < D is 



which tends to zero a.s. again by the third condition of (fT0|) . In consequence 
A^frixn — > Q in distribution yielding the assertion. □ 
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Proof of Theorem 15.11 First in I we define an extension of the Cholesky fac- 
torization which yields that r and r s are well defined. Then in II we show the 
topological properties of r and r s . The other assertions of Theorem [5J] are then 
straightforward . 

I: Define an extension of the Cholesky factorization by P(m) 3a — vX 2 v T —> 
b = \f\v T = Qti — y u £ UT- (m) where A is a diagonal matrix with non- negative 
entries, v,g € SO(m). Then, obviously u T u = a. The factorization b = gu is 
obtained by the usual Gram-Schmidt process, i.e. the first column g\ of g is 
6(i)/||&(i)|| where &(i) denotes the first non-zero column in &, the second column 
is 92 = (b(2) — 9ifr(^)5i)/||fr(2) — Sifr(2)5i|| with ^(2), the first column in b linearly 
independent of gi, etc.. Either proceed until g m (note that we have u mm > 
in consequence of det(6) > and g G SO(m)) or extend with arbitrary columns 
such that g € SO(m). In order to see that this mapping is well defined suppose 
that v! £ UT^(m) and g' e SO{m) with u' T u' = uu T . Set 

1 \ „./ ( l' 



J ' \ 

with a matrices q, q' € M(r, m) of rank r. Then, there are indeces 1 < j% < 
■ • ■ < jr such that the matrix (f = (q^ , . . . ,qj r ) composed of linearly independent 
columns qj t , . . . , qj r of g is upper triangular and regular, i.e. q € UT + (r). De- 
noting by (f = (q J j i , . . . , q'j r ) the matrix composed of the corresponding columns 
of q 1 note that by construction <f g UT(r). By hypothesis, q lT (f = which 
yields with the classical Cholesky decomposition that <f = q. Since q 17 q' = (fq 
with standing for the complementary columns in {1, . . . , r} we have indeed 
q' = q, i.e. u' = u. Obviously this mapping is an extension of the classical 
Cholesky factorization. 

The bijective mappi ngs <f> : S^ +1 -» SUT (m) and <f> s : SY>™ +1 UT(m) are 
similarly obtained (cf. iKendall et al. | (|l999l Section 1.3)) by a Gram-Schmidt 



decomposition of a e [a] e S'E™ +1 . To this end choose the sign of u mm such 
that g € SO(m) which possibly gives a negative entry u mm . Note that 

^oT s (F(ri))=[/T^(m) and o r(P(n)) = SUT^(m) . 

This shows that r and r s are indeed well defined. Moreover, r s is injective. 

II: We no w rely on the decomp osition of UT(m) and SUT(m) as simplicial 
complexes bv lKendall et al.l ( 19991 Chapter 2) defined through the topologies of 



.SI]™" 1-1 and respectively. Verify that the boundary identification on the 

respective subsets UT-(m) and SUT-(m) is the same as given by the natural 
topology of P(m). Hence, r and r s are both open and continuous. □ 

Lemma A. 2. Let e E F, be the unit matrix and a = (e + e) T (e + e) with 
error e € F™ +1 independent standard normal in every component. Then, 

E(chol(a)) ^ e . 

Proof. Let e + 5 = chol(a) giving for the entry in the first column and row 



-<5n = l + 2eu + 



1 
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Since \JJT 



y 2 + y/JT— x) 2 + y 2 > 2y 1 x 2 + y 2 observe that 



E 



(i+^n) 2 +E' 

i=2 



> E 



j'=i 



t(t) 



> 1 



for m > 1. Hence E(e + <5) = e cannot be. 



□ 



Proof of Theorem \6.1\ Since the diagonal of Wa^^^w^, p, r , t , consists of posi- 
tive values only , all el ements of P5 a ,b are mutually unique in optimal position 
dKendall et all <|l999L p. 114)). Since straight lines are mapped under the 
Helmert sub-matrix to straight lines in the configuration space which project to 
great circles in the pre-shape space, the straight line segment between w a ,p,r,t 
and w a i£ipi t t' maps to a segment on a horizontal geodesic on S^. In conse- 
quence, the projection of F/S a ,b t o (T, 2k )* is a total l y geod esic submanifold, its 
horizontal lift is again PS'ab, cf. iHuckemann et all (|2010l Section 2). □ 
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